Searching Documents with Semantically Related Keyphrases
Abstract
In this paper, we present SemKPSearch, a tool for searching documents by a query keyphrase and by keyphrases that are semantically related to it. By relating keyphrases semantically, we aim to give users extended search and browsing capabilities over a document collection and to increase the number of related results returned for a keyphrase query. Keyphrases provide a brief summary of a document's content, and they can be either author-assigned or automatically extracted from the documents. SemKPSearch uses a set of keyphrase indexes, called SemKPIndex, which are generated from the keyphrases of the documents. In addition to a keyphrase-to-document index, SemKPIndex contains a keyphrase-to-keyphrase index that stores semantic relation scores between the keyphrases in a document collection. The semantic relation score between two keyphrases is calculated with a metric that considers the similarity scores between the words of the keyphrases; the semantic similarity score between two words is determined with the help of two word-to-word semantic similarity metrics based on WordNet. SemKPSearch was evaluated by human evaluators, who found the documents retrieved with SemKPSearch more related to the query terms than the documents retrieved with a search engine.
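The two indexes described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not give the phrase-level metric, so `phrase_sim` uses a simple bidirectional best-match average, and `TOY_WORD_SIM` is a hypothetical lookup table standing in for the WordNet-based word-to-word metrics. The function names, the threshold, and the ranking rule in `search` are likewise not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical word-to-word similarity values, standing in for a
# WordNet-based metric. Each key is stored in alphabetical order.
TOY_WORD_SIM = {
    ("retrieval", "search"): 0.8,
    ("document", "text"): 0.7,
    ("extraction", "retrieval"): 0.4,
}


def word_sim(a, b):
    """Return a similarity in [0, 1] for two words (toy stand-in)."""
    if a == b:
        return 1.0
    key = (a, b) if a <= b else (b, a)
    return TOY_WORD_SIM.get(key, 0.0)


def phrase_sim(p, q):
    """Average, in both directions, each word's best match in the other phrase.

    This aggregation is an assumption, not the metric used in the paper.
    """
    pw, qw = p.lower().split(), q.lower().split()
    if not pw or not qw:
        return 0.0
    p_to_q = sum(max(word_sim(w, v) for v in qw) for w in pw) / len(pw)
    q_to_p = sum(max(word_sim(v, w) for w in pw) for v in qw) / len(qw)
    return (p_to_q + q_to_p) / 2


def build_indexes(doc_keyphrases, threshold=0.3):
    """Build a keyphrase-to-document and a keyphrase-to-keyphrase index."""
    kp_to_docs = defaultdict(set)
    for doc_id, phrases in doc_keyphrases.items():
        for kp in phrases:
            kp_to_docs[kp.lower()].add(doc_id)

    # Store a relation score for every keyphrase pair at or above the threshold.
    kp_to_kps = defaultdict(dict)
    for a, b in combinations(sorted(kp_to_docs), 2):
        score = phrase_sim(a, b)
        if score >= threshold:
            kp_to_kps[a][b] = score
            kp_to_kps[b][a] = score
    return dict(kp_to_docs), dict(kp_to_kps)


def search(query, kp_to_docs, kp_to_kps):
    """Return (doc_id, score) pairs; exact keyphrase matches score 1.0."""
    q = query.lower()
    scores = {doc: 1.0 for doc in kp_to_docs.get(q, ())}
    for related, s in kp_to_kps.get(q, {}).items():
        for doc in kp_to_docs[related]:
            scores[doc] = max(scores.get(doc, 0.0), s)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    docs = {
        "d1": ["document retrieval"],
        "d2": ["text search"],
        "d3": ["image compression"],
    }
    kp_to_docs, kp_to_kps = build_indexes(docs)
    print(search("document retrieval", kp_to_docs, kp_to_kps))
```

In this toy collection, a query for "document retrieval" also surfaces the document tagged "text search", because the two keyphrases are linked in the keyphrase-to-keyphrase index, while the unrelated document is left out. Swapping `word_sim` for a real WordNet-based measure would not change the index structure.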
Similar Resources
Coherent Keyphrase Extraction via Web Mining
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge number of documents that do not have manually ...
An Automatic Computational Geometric Key-phrase Extraction for Resolving the Contextual Retrieval Problem
Keyphrases are useful in a number of tasks such as summarizing, indexing, labelling and searching. The purpose of automatic keyphrase extraction is to select keyphrases from within the text of a given document. With automatic key-phrase extraction, it is possible to generate keyphrases for the vast number of documents that do not have manually assigned key-phrases of previous key-phrase extr...
Single-Document Keyphrase Extraction for Multi-Document Keyphrase Extraction
Here, we address the task of assigning relevant terms to thematically and semantically related sub-corpora and achieve superior results compared to the baseline performance. Our results suggest that more reliable sets of keyphrases can be assigned to the semantically and thematically related subsets of some corpora if the automatically determined sets of keyphrases for the individual documents ...
Automatic Documents Annotation by Keyphrase Extraction in Digital Libraries using Taxonomy
Keyphrases are useful for a variety of purposes, including text clustering, classification, content-based retrieval, and automatic text summarization. Only a small number of documents have author-assigned keyphrases. Manual assignment of keyphrases to existing documents is a tedious task; therefore, automatic keyphrase extraction has been extensively used to organize documents. Existing automatic ...
A Refined Methodology for Automatic Keyphrase Assignment to Digital Documents
Abstract: Keyphrases precisely express the primary topics and themes of documents and are valuable for cataloging and classification. Manually assigning keyphrases to existing documents is a tedious task; therefore, automatic keyphrase generation has been extensively used to classify digital documents. Existing automatic keyphrase generation algorithms are limited in assigning semantically rele...
Publication date: 2012